@steFaiz steFaiz commented Jan 6, 2026

Purpose

Part of #6834

This PR adds support for creating BTree indexes through the existing Spark CreateGlobalIndexProcedure.

The whole process is illustrated below:
[image]

  1. First, the values of the indexed column and their row IDs are scanned from the specified partitions.
  2. All data is range-shuffled and sorted by <partition, indexed field>.
  3. Each Spark partition contains a disjoint key range, and each writer can write key ranges for multiple Paimon partitions. The number of Spark partitions is controlled by the records-per-range and max-parallelism options.
    Note that the actual number of records in each BTree file will not exactly equal records-per-range, for two reasons: (1) Spark's range shuffle is implemented through sampling, and (2) if a Paimon partition spans multiple Spark partitions, the first and last output files may contain relatively few records (see the green index writers in the picture above).
  4. Finally, the driver collects all commit messages.
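
Steps 2 and 3 can be sketched roughly as follows. This is only an illustration under assumptions: the description does not state the exact formula, so the sketch assumes the range count is the ceiling of total records over records-per-range, capped by max-parallelism. The names RangePlanSketch, Entry, numRanges, and sortEntries are hypothetical, and an in-memory sort stands in for Spark's range shuffle.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; not the code added by this PR. */
final class RangePlanSketch {

    /** One scanned index entry: Paimon partition, indexed key, row ID. */
    static final class Entry {
        final String partition;
        final int key;
        final long rowId;

        Entry(String partition, int key, long rowId) {
            this.partition = partition;
            this.key = key;
            this.rowId = rowId;
        }
    }

    /** Assumed rule: one range per recordsPerRange records, capped by maxParallelism. */
    static int numRanges(long totalRecords, long recordsPerRange, int maxParallelism) {
        long wanted = (totalRecords + recordsPerRange - 1) / recordsPerRange; // ceiling division
        return (int) Math.max(1, Math.min(wanted, maxParallelism));
    }

    /** Sorts a copy of the entries by <partition, indexed field>, as in step 2. */
    static List<Entry> sortEntries(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((Entry e) -> e.partition).thenComparingInt(e -> e.key));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(numRanges(1_000, 100, 4)); // prints 4 (capped by max parallelism)
        System.out.println(numRanges(250, 100, 8));   // prints 3 (ceil(250 / 100))
    }
}
```

Under this assumed rule, a small max-parallelism makes each range hold more than records-per-range records, which is one more reason the per-file record count is only approximate.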

Tests

Please see org.apache.paimon.spark.procedure.CreateGlobalIndexProcedureTest for unit tests.

API and Format

This PR does not modify any existing public API.

Documentation

Will be added ASAP

@steFaiz steFaiz closed this Jan 6, 2026
@steFaiz steFaiz reopened this Jan 6, 2026

Copilot AI left a comment


Pull request overview

This PR introduces support for building BTree global indexes in Spark through the existing CreateGlobalIndexProcedure. BTree indexes provide efficient point lookups and range queries for high-cardinality data types like integers, doubles, and strings, complementing the existing Bitmap index implementation.
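
To make "point lookups and range queries" concrete, here is a small in-memory analogy. It uses java.util.TreeMap (a red-black tree, not a B-tree) purely as a stand-in for a sorted key-to-row-ID structure; it says nothing about Paimon's on-disk BTree file format, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Illustrative analogy only; not Paimon's BTree index. */
final class SortedIndexSketch {
    // Sorted map from indexed key to the row IDs that hold that key.
    private final NavigableMap<Integer, List<Long>> keyToRowIds = new TreeMap<>();

    void add(int key, long rowId) {
        keyToRowIds.computeIfAbsent(key, k -> new ArrayList<>()).add(rowId);
    }

    /** Point lookup: row IDs whose key equals the given key. */
    List<Long> lookup(int key) {
        return keyToRowIds.getOrDefault(key, new ArrayList<>());
    }

    /** Range query: row IDs whose key lies in [fromInclusive, toInclusive], in key order. */
    List<Long> range(int fromInclusive, int toInclusive) {
        List<Long> result = new ArrayList<>();
        for (List<Long> ids : keyToRowIds.subMap(fromInclusive, true, toInclusive, true).values()) {
            result.addAll(ids);
        }
        return result;
    }
}
```

Because the keys stay sorted, both operations avoid scanning every entry, which is the property that makes a sorted index attractive for high-cardinality columns.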

Key changes:

  • Implements a custom topology builder (BTreeIndexTopoBuilder) that uses Spark's range shuffle and sorting capabilities to distribute index building across partitions
  • Refactors GlobalIndexBuilder from concrete class to abstract class with separate implementations for default (bitmap) and BTree indexes
  • Adds configuration options for controlling BTree index parallelism and records per range
  • Fixes serialization bug where extraFieldIds null check was incorrectly checking indexMeta field instead
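
The class of bug in the last bullet can be illustrated with a hypothetical example. The serializer code itself is not shown in this conversation, so Meta and both methods below are invented for illustration; only the pattern (guarding extraFieldIds with a null check on indexMeta) comes from the summary above.

```java
/** Hypothetical illustration of a null check on the wrong field. */
final class NullCheckSketch {

    static final class Meta {
        final byte[] indexMeta;    // nullable
        final int[] extraFieldIds; // nullable

        Meta(byte[] indexMeta, int[] extraFieldIds) {
            this.indexMeta = indexMeta;
            this.extraFieldIds = extraFieldIds;
        }
    }

    /** Buggy: the guard looks at indexMeta, but the value read is extraFieldIds. */
    static Integer extraFieldCountBuggy(Meta m) {
        return m.indexMeta == null ? null : m.extraFieldIds.length;
    }

    /** Fixed: the guard and the value read refer to the same field. */
    static Integer extraFieldCountFixed(Meta m) {
        return m.extraFieldIds == null ? null : m.extraFieldIds.length;
    }
}
```

With the wrong guard, a record that has indexMeta but no extraFieldIds throws a NullPointerException, and a record that has extraFieldIds but no indexMeta silently loses them.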

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| BTreeIndexTopoBuilder.java | Implements distributed BTree index building using Spark range shuffle and sort; orchestrates parallel index file generation |
| BTreeGlobalIndexBuilder.java | Handles per-partition BTree index file writing with automatic flushing based on record count or partition boundaries |
| BTreeGlobalIndexBuilderFactory.java | Factory class for creating BTree index builders and topology builders via service loader pattern |
| IndexFieldsExtractor.java | Utility class for extracting partition, index field, and row ID from records during index building |
| GlobalIndexTopoBuilder.java | Interface change to support custom topology builders with direct access to SparkSession and data sources |
| GlobalIndexBuilderContext.java | Enhanced context to support nullable partition info and full range tracking for BTree indexes |
| GlobalIndexBuilder.java | Refactored to abstract class with iterator-based build method, supporting both singleton and parallel writers |
| DefaultGlobalIndexBuilder.java | Extracted default (bitmap) index building logic from the original GlobalIndexBuilder |
| CreateGlobalIndexProcedure.java | Modified to support custom topology builders that bypass traditional shard-based splitting |
| BTreeIndexOptions.java | Adds configuration options for records per range and max parallelism; fixes typo in compression-level key |
| IndexManifestEntrySerializer.java | Fixes bug where null check incorrectly evaluated indexMeta instead of extraFieldIds |
| IndexFileMetaSerializer.java | Fixes same serialization bug as IndexManifestEntrySerializer |
| CreateGlobalIndexProcedureTest.scala | Adds comprehensive tests for BTree index creation with single and multiple partitions, including overlap detection |
| Service files | Registers BTreeGlobalIndexerFactory and BTreeGlobalIndexBuilderFactory for service loader discovery |
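
The flushing behavior described for BTreeGlobalIndexBuilder.java ("based on record count or partition boundaries") might look roughly like the sketch below. It is a simplified model under assumptions: it only counts records per output file over an already sorted stream of partition values, and FlushingWriterSketch, fileSizes, and recordsPerFile are hypothetical names rather than the PR's API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the writer added by this PR. */
final class FlushingWriterSketch {

    /**
     * Walks records already sorted by partition and returns the record count of each
     * hypothetical output file. A file is closed when it holds recordsPerFile records
     * or when the next record belongs to a different partition.
     */
    static List<Integer> fileSizes(List<String> sortedPartitions, int recordsPerFile) {
        List<Integer> sizes = new ArrayList<>();
        String current = null;
        int count = 0;
        for (String partition : sortedPartitions) {
            boolean boundary = current != null && !current.equals(partition);
            if (count > 0 && (boundary || count >= recordsPerFile)) {
                sizes.add(count); // flush the current file
                count = 0;
            }
            current = partition;
            count++;
        }
        if (count > 0) {
            sizes.add(count); // flush the trailing file
        }
        return sizes;
    }
}
```

For five records of partition p1 followed by two of p2 and recordsPerFile = 3, the sketch yields files of 3, 2, and 2 records, which shows how files next to a partition boundary can end up smaller than the target size.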


@steFaiz steFaiz closed this Jan 6, 2026
@steFaiz steFaiz reopened this Jan 6, 2026
@JingsongLi JingsongLi closed this Jan 7, 2026
@JingsongLi JingsongLi reopened this Jan 7, 2026

@JingsongLi JingsongLi left a comment


+1 thanks @steFaiz

@JingsongLi JingsongLi merged commit 22bc3ab into apache:master Jan 7, 2026
37 of 76 checks passed